Memory-level parallelism

Memory Level Parallelism or MLP is a term in computer architecture referring to the ability to have multiple memory operations, in particular cache misses or translation lookaside buffer (TLB) misses, pending at the same time.
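
The distinction is easiest to see in code. The following C sketch is illustrative only (the struct layout, function names, and the assumption of a memory system with multiple miss buffers are not from the article): a pointer chase, whose cache misses must be serviced one after another, is contrasted with independent strided loads, whose misses can overlap.

```c
/* Hypothetical sketch: two loops touching comparable amounts of memory.
 * In the first, each load depends on the previous one, so cache misses
 * are serviced one at a time (no MLP).  In the second, the loads are
 * independent, so a core with multiple outstanding-miss buffers can have
 * several misses in flight at once (high MLP). */
#include <stddef.h>

struct node { struct node *next; long pad[7]; };  /* roughly one node per cache line */

/* Dependent chain: the address of each load comes from the previous load,
 * so misses cannot overlap. */
long walk_list(struct node *head) {
    long count = 0;
    for (struct node *p = head; p != NULL; p = p->next)
        count++;
    return count;
}

/* Independent accesses: the addresses are known up front, so the memory
 * system can work on many misses in parallel. */
long sum_array(const long *a, size_t n, size_t stride) {
    long sum = 0;
    for (size_t i = 0; i < n; i += stride)
        sum += a[i];
    return sum;
}
```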

In a single processor, MLP may be considered a form of ILP, instruction-level parallelism. However, ILP is often conflated with superscalar execution, the ability to execute more than one instruction at the same time. For example, a processor such as the Intel Pentium Pro is five-way superscalar, able to start executing five different microinstructions in a given cycle, yet it can handle four different cache misses for up to 20 different load microinstructions at any time.

It is possible to have a machine that is not superscalar but which nevertheless has high MLP.

Arguably, a machine that has no ILP, that is not superscalar, and that executes one instruction at a time in a non-pipelined manner, but which performs hardware prefetching (as opposed to software, instruction-level prefetching), exhibits MLP (due to multiple prefetches outstanding) but not ILP. This is because there are multiple memory _operations_ outstanding, but not _instructions_. Instructions are often conflated with operations.
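
As a hedged illustration of this point (the loop is an assumption, and any overlap of misses depends entirely on the hardware prefetcher being present), even a strictly in-order core issuing one load at a time can have several cache-line fetches in flight during a simple sequential scan, because the prefetcher issues those fetches independently of the instruction stream:

```c
#include <stddef.h>

/* Sequential scan: a hardware prefetcher (if present) can detect the
 * streaming access pattern and keep several cache-line fetches
 * outstanding, even though the core itself issues only one load
 * instruction at a time. */
long sum_sequential(const long *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];   /* misses overlap only via the prefetcher */
    return sum;
}
```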

Furthermore, multiprocessor and multithreaded computer systems may be said to exhibit MLP and ILP due to parallelism, but not intra-thread, single-process ILP and MLP. Often, however, the terms MLP and ILP are restricted to refer to extracting such parallelism from what appears to be non-parallel, single-threaded code.
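
A minimal sketch of this system-level (but not intra-thread) MLP, assuming POSIX threads and two separately allocated linked lists (names and layout are hypothetical, not from the article): each thread chases its own dependent pointer chain, so neither thread has MLP on its own, yet the two miss streams proceed concurrently across the system.

```c
#include <pthread.h>
#include <stddef.h>

struct node { struct node *next; long pad[7]; };

/* Each call walks one dependent chain: no MLP within the walking thread. */
static void *walk(void *arg) {
    volatile long count = 0;               /* volatile so the walk is not optimized away */
    for (struct node *p = arg; p != NULL; p = p->next)
        count++;
    return NULL;
}

/* Two chains walked by two threads: the system as a whole can have two
 * cache misses outstanding, one per thread, even though each thread's
 * misses are strictly serialized. */
void walk_both(struct node *list_a, struct node *list_b) {
    pthread_t t;
    pthread_create(&t, NULL, walk, list_a);
    walk(list_b);
    pthread_join(t, NULL);
}
```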
